The challenge


The High Speed Signal Processing (HSSP) project

- A joint project between Ericsson Microwave Systems and Halmstad University, Sweden


The AESA performance should fit in the same "box" as today's systems, considering

- Physical size

- Power dissipation

- Physical robustness


The goal of HSSP: "1 TFLOPS in a shoe box"


High Speed Signal Processing project


• Research for the FUTURE

- Embedded high-speed signal processing computer systems for the next-generation fighter aircraft radar.
Communication


- Strengthen our competence to ensure realization in the future

- Find engineering-efficient and economical solutions

- Actively cooperate in a wide competence network
System realization


A multi-module system concept

- SIMD compute engines for high performance

- MIMD on system level for flexibility

- Identical compute engines


• Realizable with 0.13 µm technology (LSI Logic G13)

• The system is based on in-house SIMD-based ASICs (the compute engines)

• The modules are interconnected in a ring topology by a high-speed communication network (GLVDS)

• The system scales to >1 TFLOPS
HSSP system


• Five HSSP cards (cassettes)

• High-speed ring network

• Utility bus

• Front-end (FE) with opto-interface

• Back-end (BE) with utility bus interface

• Performance: >1 TFLOPS
HSSP card (cassette)


• 8 ASIC nodes per board, 4 on each side

• Bidirectional GLVDS ring network with separate data (1.6 GB/s) and control (100 MB/s) channels

• Utility bus

• DRAM

• Performance: 200 GFLOPS
ASIC node


• Two processor arrays, acting as co-processors

• Master processor, IP core running a commercial RTOS

• I/O processor (DMA, data transformations, etc.)

• Support functions (boot, UART, timer, etc.)

• 0.13 µm technology at minimum

• Performance: 25 GFLOPS
Processor array (PA)


• 32 processor elements, 400 MHz, ring topology

• 32 kB memory per element

• Custom control unit with memory

• Performance: 12.5 GFLOPS
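
The ring topology of the processor array can be pictured with a small sketch. It models the 32 processor elements as positions in a list and performs one SIMD-style step in which every element passes its value to a neighbour. The function names `ring_shift` and `total_memory_kb` and the direction convention are assumptions made for illustration only; they are not taken from the HSSP instruction set.

```python
NUM_PES = 32        # processor elements per array (from the slide)
MEM_PER_PE_KB = 32  # local memory per element in kB (from the slide)


def ring_shift(values, step=1):
    """Return the list after every element passes its value `step` places
    around the ring. The direction convention is an assumption."""
    n = len(values)
    # Element i receives the value previously held by element (i - step) mod n.
    return [values[(i - step) % n] for i in range(n)]


def total_memory_kb(num_pes=NUM_PES, mem_per_pe_kb=MEM_PER_PE_KB):
    """Aggregate local memory of one processor array, in kB."""
    return num_pes * mem_per_pe_kb


if __name__ == "__main__":
    data = list(range(NUM_PES))
    print(ring_shift(data)[:4])   # first four elements after one shift
    print(total_memory_kb())      # 32 elements x 32 kB each
```

With these figures one array holds 32 × 32 kB = 1024 kB of element-local memory, and a value needs at most 16 single-step shifts to reach any other element if shifts are possible in both directions.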
Processor element (PE)


• 64 32-bit registers, 4 read and 3 write ports

• 4-stage pipelined FPU, IEEE 754

• fmul, fadd, fsub, mask operations

• North/South communication interface

• 64-bit memory access, skewed load and store, 3.2 GB/s BW

• Performance: 400 MFLOPS
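
The performance figures quoted on the slides can be cross-checked by rolling the per-element number up through the hierarchy. The sketch below is a minimal calculation under one stated assumption: each processor element completes one floating-point operation per clock cycle, which is what 400 MFLOPS at 400 MHz implies. All function and constant names are illustrative.

```python
CLOCK_HZ = 400_000_000   # PE clock frequency (from the slides)
FLOPS_PER_CYCLE = 1      # assumption: implied by 400 MFLOPS at 400 MHz
PES_PER_ARRAY = 32       # processor elements per processor array
ARRAYS_PER_ASIC = 2      # processor arrays per ASIC node
ASICS_PER_CARD = 8       # ASIC nodes per HSSP card
CARDS_PER_SYSTEM = 5     # HSSP cards per system
MEM_ACCESS_BYTES = 8     # 64-bit memory access per cycle


def peak_flops():
    """Peak FLOPS at each level of the hierarchy, as exact integers."""
    pe = CLOCK_HZ * FLOPS_PER_CYCLE
    pa = pe * PES_PER_ARRAY
    asic = pa * ARRAYS_PER_ASIC
    card = asic * ASICS_PER_CARD
    system = card * CARDS_PER_SYSTEM
    return {"pe": pe, "pa": pa, "asic": asic, "card": card, "system": system}


def memory_bandwidth_bytes_per_s():
    """Per-element memory bandwidth: one 64-bit access per clock cycle."""
    return MEM_ACCESS_BYTES * CLOCK_HZ


if __name__ == "__main__":
    for level, flops in peak_flops().items():
        print(f"{level:>6}: {flops / 1e9:8.1f} GFLOPS")
    print(f"memory: {memory_bandwidth_bytes_per_s() / 1e9:.1f} GB/s per PE")
```

Under this assumption the roll-up gives 12.8, 25.6 and 204.8 GFLOPS for the array, ASIC and card, and 1.024 TFLOPS for five cards. The slides quote 12.5, 25 and 200 GFLOPS, presumably rounded, and the system total is consistent with the ">1 TFLOPS" claim. The 3.2 GB/s memory bandwidth matches 8 bytes per cycle at 400 MHz.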
VLSI test implementation


• One processor array

• Clock and control distribution

• LSI Logic G12 process (0.18 µm), standard cell

• Total area 227 mm²

- Clearly dominated by memories

- Memory size can, however, be substantially decreased

• Control unit and processor elements capable of 335 and 396 MHz, respectively

• Top-level design capable of 210 MHz

• Control distribution is a bottleneck, but can be pipelined
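
To put the test-chip clock figures in perspective, the sketch below compares the peak throughput of one 32-element array at the measured 210 MHz top-level clock with the 400 MHz target. It rests on two assumptions that the slides do not state: one floating-point operation per element per cycle, and that pipelining the control distribution would lift the limit to the slowest remaining block. Function names are illustrative.

```python
NUM_PES = 32              # one processor array on the test chip
TARGET_CLOCK_MHZ = 400    # target PE clock from the processor array slide

# Clock capability per block in MHz (from the test implementation slide).
BLOCK_LIMITS_MHZ = {
    "control_unit": 335,
    "processor_elements": 396,
    "top_level": 210,
}


def achievable_clock_mhz(limits):
    """The slowest block sets the usable clock frequency."""
    return min(limits.values())


def peak_gflops(clock_mhz, num_pes=NUM_PES, flops_per_cycle=1):
    """Peak GFLOPS of one array; flops_per_cycle=1 is an assumption."""
    return clock_mhz * num_pes * flops_per_cycle / 1000


if __name__ == "__main__":
    now = achievable_clock_mhz(BLOCK_LIMITS_MHZ)
    # Hypothetical case: control distribution pipelined, so the top-level
    # limit no longer applies and the slowest remaining block dominates.
    without_top = {k: v for k, v in BLOCK_LIMITS_MHZ.items() if k != "top_level"}
    pipelined = achievable_clock_mhz(without_top)
    for label, mhz in [("measured", now), ("pipelined", pipelined),
                       ("target", TARGET_CLOCK_MHZ)]:
        print(f"{label:>9}: {mhz} MHz, {peak_gflops(mhz):.2f} GFLOPS")
```

Under these assumptions the test implementation reaches about half of the target array throughput at 210 MHz, and the control unit at 335 MHz would become the next limit once the control distribution is pipelined.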